Fix: public suffixes tld parsing #110

Merged
ato merged 2 commits into iipc:master from adam-miller:fix_public_suffixes_tld_parsing
Jul 15, 2025
Conversation

@adam-miller

Collaborator

The Heritrix dependency crawler-commons-1.4 ships a newer version of the publicsuffix.org TLD list, effective_tld_names.dat, in the root of its jar. That file collides with ours, and our parser then fails when it tries to parse the newer version. This branch moves our copy of the file into a namespaced folder to reduce collisions, updates the parsing logic to handle the errors encountered, adds tests for those cases, and updates our copy to the current latest version of the public suffixes file.
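For readers unfamiliar with the failure mode, here is a minimal sketch of the two ideas, not the code in this branch. It assumes a hypothetical class `SuffixListSketch` and a hypothetical resource path under `/org/example/publicsuffix/`; the real package and the specific parse errors that were fixed are not shown here. The parsing follows the publicsuffix.org format rules: `//` starts a comment line, blank lines are ignored, and each rule is read only up to the first whitespace.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch class; not part of webarchive-commons.
final class SuffixListSketch {
    // Assumed namespaced location. A copy in the jar root can be shadowed by
    // another jar that ships a file with the same name.
    static final String RESOURCE = "/org/example/publicsuffix/effective_tld_names.dat";

    // Extracts the rules from the lines of a public suffix list.
    // Wildcard ("*.ck") and exception ("!www.ck") markers are kept as written.
    static Set<String> parseRules(List<String> lines) {
        Set<String> rules = new LinkedHashSet<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("//")) {
                continue; // skip blank lines and comments instead of failing
            }
            // A rule ends at the first whitespace; ignore anything after it.
            rules.add(line.split("\\s+", 2)[0]);
        }
        return rules;
    }

    // Loads the list from the namespaced resource path, decoding it as UTF-8.
    static Set<String> loadBundled() throws IOException {
        try (InputStream is = SuffixListSketch.class.getResourceAsStream(RESOURCE)) {
            if (is == null) {
                throw new IOException("Missing resource: " + RESOURCE);
            }
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8));
            return parseRules(reader.lines().collect(Collectors.toList()));
        }
    }
}
```

Loading by a package-qualified path means a same-named file in another jar's root no longer shadows ours, and skipping unrecognized noise rather than throwing keeps the parser usable if a different version of the list is picked up anyway.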

@ato ato merged commit c57f059 into iipc:master Jul 15, 2025
5 checks passed
@ato

ato commented Jul 15, 2025

Member

Thanks. Released as webarchive-commons 2.0.2

2 participants